Section 7/8 Question 7.1

a) Figure 1 shows the time series of the books data. The Paperback series shows changes in level more than a trend, while the Hardcover series shows a subtle upward trend. The sawtooth pattern in the data suggests seasonality, but with peaks of varying height.

data("books")
autoplot(books, facets = TRUE) + theme_bw() + geom_smooth(se = FALSE)
## `geom_smooth()` using method = 'loess'
Time series of books

b), c), d) and e)

Figure 2 shows the effect of alpha on the in-sample SSE. For both the Paperback and Hardcover data, when the initial values are set to "simple", the SSE decreases as alpha increases, i.e. the fitted values track the actuals more closely. However, when the initial values are set to "optimal", the SSE for the Paperback data first decreases and then increases with alpha.

The optimal alpha value is shown in Figure 2 as a triangle. When both the initial values and alpha are chosen optimally, the minimal SSE is obtained at relatively small values of alpha.
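To make the SSE-vs-alpha behaviour concrete, here is a minimal hand-rolled version of the SES recursion with "simple" initialisation (level started at the first observation). `ses_sse` is an illustrative helper written for this note, not part of the forecast package, and it only sketches the quantity that `ses()` reports:

```r
# One-step-ahead SES: the fitted value at time t is the level carried
# over from t-1; the level is then updated toward the new observation.
ses_sse <- function(y, alpha) {
  level <- y[1]                 # "simple" initial level
  sse <- 0
  for (t in 2:length(y)) {
    fitted <- level             # one-step-ahead forecast for time t
    sse <- sse + (y[t] - fitted)^2
    level <- alpha * y[t] + (1 - alpha) * level
  }
  sse
}

y <- c(10, 12, 9, 14, 11, 13)
sapply(c(0.1, 0.5, 0.9), function(a) ses_sse(y, a))
```

With alpha = 1 each one-step forecast is just the previous observation, so the SSE collapses to the sum of squared first differences; smaller alphas smooth over more history, which is why the SSE curve bends with alpha.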

sesstats <- function(alpha, initial, name){
  fit <- ses(books[, name], alpha = alpha, initial = initial)
  # SSE = RMSE^2 * n, where accuracy(fit)[2] is the training RMSE
  data.frame(dataset = name,
             initial = initial,
             alpha = alpha,
             SSE = (accuracy(fit)[2])^2 * length(books[, name]))
}


arggrid <- tibble(alpha = rep(seq(0.1,0.2,by= 0.02),4),
                      initial = rep(rep(c("simple","optimal"),each = 6),2),
                      name = rep(c("Paperback", "Hardcover"),each = 12)
                      )
SSE_Results <- purrr::pmap_df(arggrid,.f = sesstats)

SSE_Results %>% 
  dplyr::mutate(dataset = as.factor(dataset)) %>% 
  ggplot(mapping = aes(x = alpha, y = SSE,color = initial)) + geom_point() + geom_line() + facet_wrap(~dataset,nrow = 2,scales = "free") + theme_bw()
SSE vs alpha by simple and optimal initial values

ses.Paperback.simple <- ses(books[,"Paperback"],initial = "simple", h = 4)
ses.Paperback.optimal <- ses(books[,"Paperback"],initial = "optimal", h = 4)
ses.Hardcover.simple <- ses(books[,"Hardcover"],initial = "simple", h = 4)
ses.Hardcover.optimal <- ses(books[,"Hardcover"],initial = "optimal", h = 4)

EstAlphaModels <- list(ses.Paperback.smpl = ses.Paperback.simple,
                       ses.Paperback.Opt = ses.Paperback.optimal,
                       ses.Hardcover.smpl = ses.Hardcover.simple,
                       ses.Hardcover.Opt = ses.Hardcover.optimal)


extractAlpha <- function(mdl){
  mdl %>% .$model %>% .$par %>% head(1)
}

dataset <- rep(c("Paperback","Hardcover"),each = 2)
initial <- rep(c("simple","optimal"),2)
alpha <- t(purrr::map_df(EstAlphaModels,extractAlpha))
RMSE <- t(purrr::map_df(EstAlphaModels,.f =  accuracy)[2,])
SSE <- RMSE^2*length(books[,1])

OptimalAlpha <- data.frame(dataset, initial,alpha = alpha,SSE,row.names = NULL)
SSE_Results <- rbind.data.frame(SSE_Results,OptimalAlpha) %>% dplyr::mutate(OptimalAlpha = c(rep(0,nrow(SSE_Results)),rep(1,nrow(OptimalAlpha))))


SSE_Results %>% 
  dplyr::mutate(dataset = as.factor(dataset),
                OptimalAlpha = as.logical(OptimalAlpha)) %>% 
  ggplot(mapping = aes(x = alpha, y = SSE,color = initial, shape = OptimalAlpha)) + geom_point() + geom_line() + facet_wrap(~dataset,nrow = 2,scales = "free") + theme_bw()
SSE vs alpha by simple and optimal initial values

par(mfrow = c(2,2))
plot(ses.Paperback.simple, main = "SES - Paperback, optimal alpha, initial : simple",cex.main = 1)
plot(ses.Paperback.optimal, main = "SES - Paperback, optimal alpha, initial : optimal",cex.main = 1)
plot(ses.Hardcover.simple, main = "SES - Hardcover, optimal alpha, initial : simple",cex.main = 1)
plot(ses.Hardcover.optimal,main = "SES - Hardcover, optimal alpha, initial : optimal",cex.main = 1)
Forecasts with optimal alpha

Figure 3 shows that the forecasts are nearly identical across settings. Simple exponential smoothing produces a flat forecast from the final smoothed level, which is dominated by the most recent actuals; hence the forecasts coincide.

It can be seen that the optimal settings of alpha and initial values have reduced the SSE.

Question 7.2

a) Figure 4 shows the forecasts for the next four days of Hardcover and Paperback sales, computed using Holt's method.

holt.paperback <- holt(books[,"Paperback"], h = 4)
holt.hardcover <- holt(books[,"Hardcover"], h = 4)
par(mfrow = c(1,2))
plot(holt.hardcover, main = "Holt's method - Hardcover", cex.main = 1)
plot(holt.paperback, main = "Holt's method - Paperback", cex.main = 1)
Holt's method forecast for Hardcover & Paperback

SSE_holt <- data.frame(dataset = c("Paperback", "Hardcover"),
                      SSE = c(sw_glance(holt.paperback %>% .$model)$RMSE^2*length(books[,"Paperback"]),
                              sw_glance(holt.hardcover %>% .$model)$RMSE^2*length(books[,"Hardcover"]))
                      )

SSE_holt_ses <-
  rbind.data.frame(
  SSE_Results %>% dplyr::filter(OptimalAlpha == 1) %>%
  dplyr::filter(initial == "optimal") %>%
  dplyr::select(dataset, SSE),
  SSE_holt
  ) %>%
  dplyr::mutate(Model = rep(c("Exponential Smoothing", "Holt"), each = 2))

Forecasts <- cbind(SSE_holt_ses,t(sapply(list(ses.Paperback.optimal,ses.Hardcover.optimal,holt.paperback,holt.hardcover), function(x) data.frame(forecast(x,h=1))))) %>% dplyr::arrange(dataset)

knitr::kable(Forecasts, caption = "Holts vs simple exponential smoothing forecasts")
Holts vs simple exponential smoothing forecasts
| dataset | SSE | Model | Point.Forecast | Lo.80 | Hi.80 | Lo.95 | Hi.95 |
|---|---|---|---|---|---|---|---|
| Hardcover | 30587.69 | Exponential Smoothing | 239.5602 | 198.639 | 280.4815 | 176.9766 | 302.1439 |
| Hardcover | 22581.83 | Holt | 247.3504 | 212.1899 | 282.5109 | 193.577 | 301.1237 |
| Paperback | 33944.82 | Exponential Smoothing | 207.1098 | 164.0013 | 250.2182 | 141.1811 | 273.0384 |
| Paperback | 30074.17 | Holt | 207.038 | 166.4617 | 247.6143 | 144.982 | 269.0941 |

b) Table 1 shows the forecasts from both simple exponential smoothing and Holt's (double exponential) method. The SSE of Holt's method is lower than that of simple exponential smoothing, because Holt's method is able to capture the slope, or trend, in the data.
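For reference, Holt's linear method adds a trend (slope) equation to simple exponential smoothing; in the standard textbook notation:

\(\ell_{t} = \alpha y_{t} + (1-\alpha)(\ell_{t-1} + b_{t-1})\)

\(b_{t} = \beta^{*}(\ell_{t} - \ell_{t-1}) + (1-\beta^{*})b_{t-1}\)

\(\hat{y}_{t+h|t} = \ell_{t} + hb_{t}\)

The forecast projects the smoothed trend \(b_{t}\) linearly, which is exactly the slope that the flat SES forecast misses.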

c) The forecasts from Holt's method are better, for the reason mentioned above.

d)

Question 7.3

Holt's method is fitted for several combinations of alpha and beta, with the exponential trend turned on and off and damping turned on and off. Figure 4 shows the contour plots of RMSE versus alpha and beta for each combination of exponential trend and damping.

The RMSE is insensitive to beta when damping is turned off and alpha is greater than 0.3, which is an interesting observation. The best and worst performing models are plotted below (Figure 5).

data("eggs")
inputs <- expand.grid(alpha = seq(0.001, 0.5, length.out = 10),beta = seq(0.0001, 0.05, length.out = 10),exponential = c(T,F), damped = c(T,F))

inputs %<>% dplyr::filter(alpha > beta)

holtfct <- function(alpha,beta, exponential, damped){
  fcts <- holt(eggs,h= 100,alpha = alpha, beta = beta,exponential = exponential,damped = damped)
  result <- fcts %>% .$model %>% sw_glance() %>% 
    dplyr::mutate(alpha = alpha, beta = beta,exponential = exponential, damped = damped )
  list(result = result,model = fcts)
}

safe_holtfct <- purrr::safely(holtfct)
Results <- purrr::pmap(inputs,safe_holtfct)
# removing those inputs that error out

inputs <- inputs[sapply(seq_along(Results), function(x) is.null(Results[[x]]$error)), ]

# re do model fit

Results <- purrr::pmap(inputs,holtfct)
Accuracy <- lapply(seq_along(Results), function(x) Results[[x]]$result)
Accuracy <- do.call("rbind", Accuracy)

dampF <- Accuracy %>% dplyr::filter(damped == F) %>%  ggplot(data = .,aes(x = alpha, y = beta, z = RMSE)) + geom_contour(aes(color = ..level..)) +  scale_color_gradient(low="blue", high='red')+ facet_wrap(~exponential)

dampT <- Accuracy %>% dplyr::filter(damped == T) %>%  ggplot(data = .,aes(x = alpha, y = beta, z = RMSE)) + geom_contour(aes(color = ..level..)) + scale_color_gradient(low="blue", high='red') +
  facet_wrap(~exponential)

gridExtra::grid.arrange(dampT,dampF)
Contour plots of RMSE vs alpha & beta. Top row: with damping; bottom row: without damping. Within each row, left panel: exponential trend off; right panel: exponential trend on.

par(mfrow = c(2,1))
plot(Results[[which.min(Accuracy$RMSE)]]$model,PI = F, main = "Alpha = 0.5, Beta= 0.0001, Exponential = F, Damping = F", cex.main = 0.6)
plot(Results[[which.max(Accuracy$RMSE)]]$model, PI = F, col = "red", main = "Alpha = 0.06, Beta= 0.03, Exponential = F, Damping = F",cex.main = .6)
Top: Best model; Bottom: Worst model

Question 7.4

a) The ukcars data is non-stationary, with seasonality, trend and changes in level. Figure 6 shows the time series plot.

data("ukcars")
plot(ukcars)
Time series of ukcars

b) The ukcars data is decomposed using the stl() method. Figure 7 shows the decomposition.

stlfit <- stl(ukcars,s.window = "periodic", robust = T)
autoplot(stlfit)
STL decomposition of ukcars

# Note: seasadj(stlfit) (trend + remainder) is the usual seasonally
# adjusted series; here only the trend component is carried forward.
seasonalAdj <- stlfit$time.series[,"trend"]

c) Figure 8 shows the forecast from an additive damped Holt model applied to the seasonally adjusted data obtained from stl(). The model summary is shown below.

holtfit <- holt(seasonalAdj,h = 8,damped = T,exponential = F)
summary(holtfit)
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = seasonalAdj, h = 8, damped = T, exponential = F) 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 0.9999 
##     phi   = 0.8 
## 
##   Initial states:
##     l = 329.7418 
##     b = -2.964 
## 
##   sigma:  5.511
## 
##      AIC     AICc      BIC 
## 931.9176 932.7100 948.2819 
## 
## Error measures:
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.1347103 5.510954 4.057228 0.06727714 1.273959 0.2068624
##                   ACF1
## Training set 0.1835495
## 
## Forecasts:
##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       402.8698 395.8072 409.9323 392.0685 413.6710
## 2005 Q3       401.0937 386.5521 415.6353 378.8543 423.3332
## 2005 Q4       399.6729 377.1260 422.2198 365.1904 434.1555
## 2006 Q1       398.5363 367.8288 429.2437 351.5733 445.4992
## 2006 Q2       397.6269 358.8134 436.4405 338.2667 456.9872
## 2006 Q3       396.8995 350.1527 443.6463 325.4064 468.3925
## 2006 Q4       396.3175 341.8756 450.7594 313.0558 479.5792
## 2007 Q1       395.8519 333.9858 457.7181 301.2358 490.4681
plot(holtfit, ylab = "Seasonally adjusted data")
Forecast of seasonally adjusted data using Holt's additive method with damping.

# reseasonalize
seasoncomp <- stlfit$time.series[,"seasonal"]
seasonalcompfct <- snaive(seasoncomp, h= 8)
fcts <- holtfit$mean + seasonalcompfct$mean
reseasonlizedts <- ts(c(ukcars,fcts),start = c(1977,1), frequency = 4)
year <- c(rep(1977:2004, each = 4), rep(2005:2006, each = 4), 2007)
qtr <- c(rep(1:4,(2006-1977+1)),1)
data.frame(data = reseasonlizedts, year = year, qtr = qtr, year.qtr = year + qtr/10, forecast = c(rep(F,113),rep(T,8))) %>% 
  ggplot(aes(x = year.qtr, y = data, color = forecast)) + geom_line()
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
Reseasonalized forecasts

#autoplot(reseasonlizedts, col= c(rep("grey",113),rep("blue",8)))
RMSE_Holtdamp <- sqrt(sum((ukcars - (holtfit$fitted + seasonalcompfct$fitted))^2, na.rm = T)/113)
accuracy(holtfit)
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.1347103 5.510954 4.057228 0.06727714 1.273959 0.2068624
##                   ACF1
## Training set 0.1835495

Figure 9 shows the reseasonalized forecasts; the RMSE for this model is 57.

d)

Figure 10 shows the reseasonalized forecasts using Holt's linear method; the RMSE is 56.88, marginally better than the forecast with damping.

holtlinear <- holt(seasonalAdj,h = 8,damped = F,exponential = F)
fcts <- holtlinear$mean + seasonalcompfct$mean
reseasonlizedts <- ts(c(ukcars,fcts),start = c(1977,1), frequency = 4)
year <- c(rep(1977:2004, each = 4), rep(2005:2006, each = 4), 2007)
qtr <- c(rep(1:4,(2006-1977+1)),1)
data.frame(data = reseasonlizedts, year = year, qtr = qtr, year.qtr = year + qtr/10, forecast = c(rep(F,113),rep(T,8))) %>% 
  ggplot(aes(x = year.qtr, y = data, color = forecast)) + geom_line()
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
Reseasonalized forecasts - Holt's linear

RMSE_holtlinear <- sqrt(sum((ukcars - (holtlinear$fitted + seasonalcompfct$fitted))^2, na.rm = T)/113)
accuracy(holtlinear)
##                      ME     RMSE      MAE           MPE     MAPE      MASE
## Training set -0.1504732 5.056389 3.901235 -0.0008004432 1.218107 0.1989089
##                   ACF1
## Training set 0.3659724

d), e) and f) Figure 11 shows the decomposition using ets(). The resulting ETS(A,N,A) model has an RMSE of 25.25, which is almost half that of the previous two models. The ETS model is the most reasonable in this case.

ets.fit <- ets(ukcars, model = "ZZZ")
autoplot(ets.fit)
ETS model fit for ukcars

plot(forecast(ets.fit, h = 8))
ETS model fit for ukcars

Comparing the models, the ETS(A,N,A) model seems to be the better one from an RMSE perspective.

Question 7.5

a) The data is a non-stationary time series with seasonality and trend. The variation in the seasonal component increases with the level of the series.

data("visitors")
autoplot(visitors)
Time series of visitors

b & c) The Holt-Winters forecast for the next two years is shown in the plot below.

The multiplicative Holt-Winters method is necessary here because the seasonal variation increases with the level of the series.
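Concretely, the multiplicative Holt-Winters forecast scales the trend-adjusted level by a seasonal index (standard form, with period \(m = 12\) for monthly data):

\(\hat{y}_{t+h|t} = (\ell_{t} + hb_{t})\,s_{t+h-m(k+1)}\)

where \(k\) is the integer part of \((h-1)/m\), so the seasonal indices from the final year are reused. Because the seasonal factor multiplies the level, larger levels produce proportionally larger seasonal swings, matching the pattern seen in the visitors series.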

hw.m.fit <- hw(visitors,seasonal = "multiplicative",h = 24)
plot(hw.m.fit)
Holt-Winters forecast

d & e)

Figure 14 shows the experimentation with the exponential and damped settings. Based on Table 2, the damped Holt-Winters multiplicative method gives the best RMSE.

par(mfrow=c(2,2))
plot(hw.m.fit,cex.main = 0.7)
plot(hw(y = visitors,seasonal = "multiplicative", exponential = T),cex.main = 0.7)
plot(hw(y = visitors,seasonal = "multiplicative", exponential = T, damped = T),cex.main = 0.7)
plot(hw(y = visitors,seasonal = "multiplicative", exponential = F, damped = T),cex.main = 0.7)
Experimentation with damping and exponential

Models <- list(hw.m.fit = hw.m.fit,hw.m.exp = hw(y = visitors,seasonal = "multiplicative", exponential = T),
               hw.m.exp.damp = hw(y = visitors,seasonal = "multiplicative", exponential = T, damped = T),
               hw.m.damp = hw(y = visitors,seasonal = "multiplicative", exponential = F, damped = T))
df <- do.call("rbind",purrr::map(Models, ~sw_glance(.x$model)))

knitr::kable(df, caption = "Model Summary & Performance")
Model Summary & Performance
| | model.desc | sigma | logLik | AIC | BIC | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hw.m.fit | Holt-Winters’ multiplicative method | 0.0549701 | -1299.795 | 2633.589 | 2692.760 | -0.9498442 | 14.82950 | 10.96716 | -0.8150922 | 4.271167 | 0.4050069 | 0.2223887 |
| hw.m.exp | Holt-Winters’ multiplicative method with exponential trend | 0.0555770 | -1299.884 | 2633.767 | 2692.938 | 0.6442177 | 14.49416 | 10.62951 | 0.2554469 | 4.032800 | 0.3925378 | 0.0759579 |
| hw.m.exp.damp | Damped Holt-Winters’ multiplicative method with exponential trend | 0.0544229 | -1295.373 | 2626.745 | 2689.397 | 0.7230142 | 14.45533 | 10.72791 | 0.0379870 | 4.090931 | 0.3961716 | 0.0121817 |
| hw.m.damp | Damped Holt-Winters’ multiplicative method | 0.0542224 | -1294.409 | 2624.818 | 2687.469 | 0.9123468 | 14.44801 | 10.64909 | 0.0707184 | 4.064322 | 0.3932608 | 0.0174064 |

f) Figure 15 shows the forecasts from the various models, and Table 3 shows the respective RMSE values. The hybrid STL-ETS model on Box-Cox transformed data is the best from an RMSE perspective; however, its residual signature makes it a poor model. Both ETS models seem to work well for this data.

ets.mdl <- ets(visitors)
ets.additive <- ets(visitors,additive.only = T,lambda = BoxCox.lambda(visitors))
snaive.mdl <- snaive(visitors,lambda = BoxCox.lambda(visitors))
stl.decop <- stl(x = BoxCox(visitors,BoxCox.lambda(visitors)),s.window = "periodic",robust = T)
ets.seasadj <- ets(stl.decop$time.series[,"trend"])
stl.ets <- ets.seasadj$fitted + stl.decop$time.series[,"seasonal"]
stl.ets.fitted <- InvBoxCox(stl.ets,BoxCox.lambda(visitors))
stl.ets.res <- visitors - stl.ets.fitted

# generate forecast

fct.ets.seasonadj <- forecast(ets.seasadj,h = 24)
fct.stl.season <- snaive(stl.decop$time.series[,"seasonal"],h = 24)
fct.stl.ets <- InvBoxCox(fct.ets.seasonadj$mean + fct.stl.season$mean,BoxCox.lambda(visitors))

fct.hw.m.fit <- forecast(hw.m.fit,h = 24)
fct.ets.mdl <- forecast(ets.mdl,h = 24)
fct.ets.additive <- forecast(ets.additive, h = 24)
fct.snaive <- forecast(snaive.mdl, h = 24)

# plot forecast

plot(visitors, col = "grey",xlim= c(1985,2008))
lines(fct.hw.m.fit$mean, col = "black", lty = 2)
lines(fct.ets.mdl$mean, col = "red", lty = 2)
lines(fct.ets.additive$mean, col = "green", lty = 2)
lines(fct.snaive$mean, col = "orange", lty = 2)
lines(fct.stl.ets,col = "blue", lty = 2)
legend("topleft",legend = c("visitors", "Holts-Winters forecast",
                            "ETS forecast", "ETS additive forecast",
                            "snaive forecast","stl-ets hybrid forecast"),
       lty = c(1,rep(2,5)), col = c("grey","black","red","green","orange","blue"),cex = 0.5)
Forecast of visitors

# RSME generation 
mdls <- list(ets.mdl,ets.additive)
#sw_glance(snaive.mdl$model)
mdls_RMSE <- purrr::reduce(purrr::map(mdls,sw_glance),bind_rows) %>% select(model.desc,RMSE) %>% 
  dplyr::mutate(RMSE = round(RMSE,2)) %>% 
  rbind.data.frame(
    data.frame(model.desc = c("snaive","stl.ets"),
    RMSE = round(c(accuracy(snaive.mdl)[2], sqrt(mean(stl.ets.res^2))),2)
    )
    
  ) 

knitr::kable(mdls_RMSE,caption = "RMSE for model fits")
RMSE for model fits
| model.desc | RMSE |
|---|---|
| ETS(M,A,M) | 15.86 |
| ETS(A,Ad,A) | 15.58 |
| snaive | 32.57 |
| stl.ets | 14.32 |
par(mfrow = c(2,3))
r1 <- checkresiduals(hw.m.fit)
Residual Analysis of models

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 47.794, df = 8, p-value = 1.081e-07
## 
## Model df: 16.   Total lags used: 24
r2 <- checkresiduals(ets.mdl)
Residual Analysis of models

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,A,M)
## Q* = 21.278, df = 8, p-value = 0.006444
## 
## Model df: 16.   Total lags used: 24
r3 <- checkresiduals(ets.additive)
Residual Analysis of models

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,Ad,A)
## Q* = 18.049, df = 7, p-value = 0.01175
## 
## Model df: 17.   Total lags used: 24
r4 <- checkresiduals(snaive.mdl)
Residual Analysis of models

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 468.11, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24
r5 <- checkresiduals(stl.ets.res)
Residual Analysis of models

Section 8, Question 8.5

a & b)

The autoregressive model of order \(p\) is given by

\(y_{t} = c + \phi_{1}y_{t-1} + \phi_{2}y_{t-2} + ... + \phi_{p}y_{t-p} + e_{t}\)

For the AR(1) model,

\(y_{t} = c + \phi_{1}y_{t-1} + e_{t}\)

where \(e_{t}\) is normally distributed error with mean zero and variance \(\sigma^2 = 1\). Here \(c\) is assumed to be 0.

AR1simul <- function(phi){
  set.seed(1)
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100)
    y[i] <- phi * y[i-1] + e[i]
  plot(y, main = paste("AR(1) of phi = ", phi), cex.main = 0.7)
}
# y <- AR1simul(0.05)
# set.seed(1)
# ytest <- arima.sim(list(ar=0.05),n = 100)
phi <- c(-1.2,-0.3, -0.1, 0.2, 0.4, 0.6, 0.8,1, 1.2)
par(mfrow = c(3,3))
test <- lapply(phi,AR1simul)
AR(1) simulations

It can be seen from Figure 17 that as \(\phi_{1}\) increases in magnitude, the series becomes non-stationary. When \(\phi_{1} = 1\), the series becomes a random walk.
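This matches the usual stationarity condition for an AR(1) process: it is stationary only when \(|\phi_{1}| < 1\), in which case (with \(c = 0\)) the unconditional variance is

\(\text{Var}(y_{t}) = \dfrac{\sigma^{2}}{1 - \phi_{1}^{2}}\)

which blows up as \(\phi_{1} \to 1\), consistent with the random-walk behaviour seen in the plots, and is undefined for \(|\phi_{1}| \geq 1\), where the simulated series explode.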

# arima.sim() cross-check of the manual simulation; the coefficient here
# enters as an MA term, and only values inside (-1, 1) are used since
# the original index phi[10] was out of bounds for the 9-element phi.
arimsim <- function(theta) {
  sim <- arima.sim(list(ma = theta), n = 100)
  ts.plot(sim)
}

test2 <- lapply(phi[abs(phi) < 1], arimsim)

c & d)

Figure 18 shows that as the coefficient changes, different time series are generated.
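The process simulated below is an MA(1); in standard notation (the code's `phi` plays the role of \(\theta_{1}\)):

\(y_{t} = e_{t} + \theta_{1}e_{t-1}\)

with variance \((1 + \theta_{1}^{2})\sigma^{2}\). Unlike the AR(1) case, an MA(1) series is stationary for any value of the coefficient, since it is a finite weighted sum of white noise.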

MA1simul <- function(theta){
  set.seed(1)
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100)
    y[i] <- theta * e[i-1] + e[i]
  plot(y, main = paste("MA(1) with theta = ", theta), cex.main = 0.7)
}
par(mfrow = c(3,3))
test <- lapply(phi, MA1simul)

e), f) & g)

The ARMA(1,1) series seems stationary; however, the AR(2) series does not.

ts.sim <- arima.sim(list(order = c(1,0,1), ar = 0.6, ma = 0.6), n = 100)

AR2simul <- function(phi1, phi2){
  set.seed(1)
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 3:100)
    y[i] <- phi1 * y[i-1] + phi2 * y[i-2] + e[i]
  plot(y, main = paste("AR(2) of phi1 = ", phi1, " & phi2 = ", phi2), cex.main = 0.7)
}

par(mfrow = c(2,1))
ts.plot(ts.sim, main = "ARMA(1,1) simulation")
ts.plot(ts.sim, main = "ARMA(1,1) simulation")
ARMA(1,1) and AR(2) comparison

AR2simul(-0.8,0.3)
ARMA(1,1) and AR(2) comparison

Question 8.6

a & d)

Figure 21 shows the time series of wmurders. Given the changes in level, reminiscent of stock data, the series would need at least one differencing, and a naive model forecasting the previous value would likely be competitive. Either one autoregressive term or one moving average term (or perhaps both) might work best.

data("wmurders")
autoplot(wmurders)

Here we’ll fit ARIMA(1,1,0), ARIMA(0,1,1) and ARIMA(1,1,1) models. In terms of accuracy, all three models are close; from the residual perspective, ARIMA(1,1,1) seems relatively better.

mdl1 <- Arima(wmurders,order = c(1,1,0))
mdl2 <- Arima(wmurders,order = c(0,1,1))
mdl3 <- Arima(wmurders,order = c(1,1,1))

purrr::reduce(.x = purrr::map(list(mdl1,mdl2,mdl3),.f = sw_glance),.f = bind_rows)
checkresiduals(mdl1)
Residual checks for wmurders

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,1,0)
## Q* = 13.281, df = 9, p-value = 0.1503
## 
## Model df: 1.   Total lags used: 10
checkresiduals(mdl2)
Residual checks for wmurders

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1)
## Q* = 13.122, df = 9, p-value = 0.1572
## 
## Model df: 1.   Total lags used: 10
checkresiduals(mdl3)
Residual checks for wmurders

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,1,1)
## Q* = 11.266, df = 8, p-value = 0.1871
## 
## Model df: 2.   Total lags used: 10

b) Adding a constant adds a drift to the forecasts. The time series is not constantly increasing or decreasing, hence adding a constant may not make sense for this data.
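A standard way to see this: for an ARIMA model with one difference (\(d = 1\)), including a constant \(c\) makes the long-run forecasts follow a straight line, approximately

\(\hat{y}_{T+h|T} \approx y_{T} + h\,\dfrac{c}{1 - \phi_{1} - \dots - \phi_{p}}\) for large \(h\),

so a constant is only appropriate when the series drifts steadily in one direction, which wmurders does not.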

c)

\((1 - \phi_{1}B)(1 - B)y_{t} = (1 + \theta_{1}B)e_{t}\)

e & f)

plot(forecast(mdl3, h = 3))
Forecast with Arima

arimaauto <- auto.arima(wmurders)
accuracy(arimaauto)
##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.01065956 0.2072523 0.1528734 -0.2149476 4.335214 0.9400996
##                    ACF1
## Training set 0.02176343
checkresiduals(arimaauto)
Auto Arima model

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,2,1)
## Q* = 12.419, df = 8, p-value = 0.1335
## 
## Model df: 2.   Total lags used: 10

g) The ARIMA(1,1,1) and ARIMA(1,2,1) models are very comparable in their forecast accuracy.

Question 8.7

a)

The austourists data has both seasonality and trend.

data("austourists")
autoplot(austourists)

b & c) The ACF captures the trend and shows nearly all lags as significant. The PACF removes the effect of the trend and shows that the seasonality (lag 4) is the key correlation.

par(mfrow = c(1,2))
Acf(austourists)
pacf(austourists)
ACF & PACF austourists

d)

# one seasonal difference (diff's second argument is the lag, not the
# number of differences returned by nsdiffs)
plot(diff(austourists, lag = 4))
Differenced austourists

The seasonally differenced series still shows some seasonality in the data. An ARIMA(1,0,0)(P,1,0) model with P = 1 (or 4?) may work.

auto.arima(austourists)
## Series: austourists 
## ARIMA(1,0,0)(1,1,0)[4] with drift 
## 
## Coefficients:
##          ar1     sar1   drift
##       0.4493  -0.5012  0.4665
## s.e.  0.1368   0.1293  0.1055
## 
## sigma^2 estimated as 5.606:  log likelihood=-99.47
## AIC=206.95   AICc=207.97   BIC=214.09

f)

For the ARIMA(1,0,0)(1,1,0)[4] with drift model above: \((1 - \phi_{1}B)(1 - \Phi_{1}B^{4})(1 - B^{4})y_{t} = c + e_{t}\)

Question 8.8

a) The time series shows an increasing trend, and the variation in the peaks grows over time with the level of the series.

data("usmelec")

plot(usmelec,col = "grey")
lines(forecast::ma(usmelec, 12))
Time series of usmelec

b)

lambda <- BoxCox.lambda(usmelec)
autoplot(BoxCox(usmelec,lambda))

The Box-Cox transformation has reduced the variability in the peaks and valleys. There is still seasonality in the data, hence it is not stationary. A 12-month difference might help, as shown below.
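The transformation applied by BoxCox() follows the usual definition; a minimal base-R sketch (`box_cox` is an illustrative helper written here, not the forecast-package function):

```r
# Box-Cox transform: (y^lambda - 1) / lambda, with the natural log
# as the limiting case at lambda = 0
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

box_cox(c(1, 4, 100), 0.5)  # square-root-like compression of large values
```

In the workflow below, the forecast package chooses lambda via BoxCox.lambda() and maps forecasts back to the original scale with InvBoxCox().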

usmelectrans <- BoxCox(usmelec,lambda)
plot(diff(usmelectrans,12))
12 month differencing of transformed usmelec

d & e)

Per the comparison table, the manually specified ARIMA(2,1,1)(2,0,1)[12] model comes out best from both the AIC and RMSE perspectives, beating the auto.arima() choice, whose residuals also fail the Ljung-Box test.

fit1 <- Arima(usmelectrans, order = c(1,0,1), seasonal = c(1,0,1))
fit2 <- Arima(usmelectrans, order = c(2,1,1), seasonal = c(2,0,1))
fit3 <- auto.arima(usmelectrans)
mdls2 <- list(fit1,fit2,fit3)

knitr::kable(purrr::reduce(purrr::map(mdls2,.f = sw_glance),bind_rows), caption = "Model comparison")
Model comparison
| model.desc | sigma | logLik | AIC | BIC | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ARIMA(1,0,1)(1,0,1)[12] with non-zero mean | 0.0019580 | 2165.253 | -4318.505 | -4293.797 | 0.0001228 | 0.0019472 | 0.0015117 | 0.0062961 | 0.0778706 | 0.5753045 | 0.112863 |
| ARIMA(2,1,1)(2,0,1)[12] | 0.0018945 | 2179.817 | -4345.634 | -4316.823 | -0.0000682 | 0.0018798 | 0.0014530 | -0.0035267 | 0.0748420 | 0.5529837 | -0.017035 |
| ARIMA(2,1,2)(0,0,2)[12] with drift | 0.0031415 | 1966.482 | -3916.965 | -3884.038 | 0.0000079 | 0.0031137 | 0.0025166 | 0.0001899 | 0.1296157 | 0.9577474 | -0.018691 |
checkresiduals(fit3)
Residual check

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,1,2)(0,0,2)[12] with drift
## Q* = 224.05, df = 17, p-value < 2.2e-16
## 
## Model df: 7.   Total lags used: 24

f & g)

Forecasts are reasonable in the short term but not in the long term. Forecast models with drift need to be used carefully.

fcts <- forecast(fit3, h = 180)
plot(fcts$mean)
15 year projection